5 research outputs found
Statistical Challenges in Online Controlled Experiments: A Review of A/B Testing Methodology
The rise of internet-based services and products in the late 1990s brought
about an unprecedented opportunity for online businesses to engage in large
scale data-driven decision making. Over the past two decades, organizations
such as Airbnb, Alibaba, Amazon, Baidu, Booking, Alphabet's Google, LinkedIn,
Lyft, Meta's Facebook, Microsoft, Netflix, Twitter, Uber, and Yandex have
invested tremendous resources in online controlled experiments (OCEs) to assess
the impact of innovation on their customers and businesses. Running OCEs at
scale has presented a host of challenges requiring solutions from many domains.
In this paper we review challenges that require new statistical methodologies
to address them. In particular, we discuss the practice and culture of online
experimentation, as well as its statistics literature, placing the current
methodologies within their relevant statistical lineages and providing
illustrative examples of OCE applications. Our goal is to raise academic
statisticians' awareness of these new research opportunities to increase
collaboration between academia and the online industry.
Optimal Supersaturated Designs for Lasso Sign Recovery
Supersaturated designs, in which the number of factors exceeds the number of
runs, are often constructed under a heuristic criterion that measures a
design's proximity to an unattainable orthogonal design. Such a criterion does
not directly measure a design's quality in terms of screening. To address this
disconnect, we develop optimality criteria to maximize the lasso's sign
recovery probability. The criteria have varying amounts of prior knowledge
about the model's parameters. We show that an orthogonal design is an ideal
structure when the signs of the active factors are unknown. When the signs are
assumed known, we show that a design whose columns exhibit small, positive
correlations is ideal. Such designs are sought after by the Var(s+)-criterion.
These conclusions are based on a continuous optimization framework, which
rigorously justifies the use of established heuristic criteria. From this
justification, we propose a computationally efficient design search algorithm
that filters through optimal designs under different heuristic criteria to
select the one that maximizes the sign recovery probability under the lasso.
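
The sign recovery probability named in this abstract can be approximated by
simulation. The sketch below is an illustration only, not the authors'
criterion or search algorithm. It assumes the lasso objective
(1/(2n))||y - Xb||^2 + lam*||b||_1, Gaussian noise, and a plain coordinate
descent solver; the names lasso_cd and sign_recovery_probability and the
example design are hypothetical.

```python
import random


def soft_threshold(z, t):
    """Soft-thresholding operator used in the lasso coordinate update."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate descent for (1/(2n))||y - Xb||^2 + lam*||b||_1 (assumed form)."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X beta, starting from beta = 0
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            # inner product of column j with the partial residual, scaled by 1/n
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta


def sign(v, tol=1e-8):
    """Return -1, 0, or 1, treating tiny values as zero."""
    return 0 if abs(v) <= tol else (1 if v > 0 else -1)


def sign_recovery_probability(X, beta, lam, sigma, n_sims=200, seed=0):
    """Monte Carlo estimate of P(sign(lasso estimate) == sign(beta))."""
    rng = random.Random(seed)
    target = [sign(b) for b in beta]
    signal = [sum(x * b for x, b in zip(row, beta)) for row in X]
    hits = 0
    for _ in range(n_sims):
        y = [m + rng.gauss(0.0, sigma) for m in signal]
        estimate = lasso_cd(X, y, lam)
        hits += [sign(b) for b in estimate] == target
    return hits / n_sims


# Hypothetical 4-run orthogonal design, used only to exercise the function.
H = [[1, 1, 1, 1], [1, -1, 1, -1], [1, 1, -1, -1], [1, -1, -1, 1]]
print(sign_recovery_probability(H, [2.0, 0.0, -2.0, 0.0], lam=0.5, sigma=1.0))
```

A supersaturated design (more columns than rows) can be passed as X in the
same way, so two candidate designs can be compared by their estimated
probabilities under the same assumed beta, lam, and sigma.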
Biological resurfacing in a canine model of hip osteoarthritis
[Figure: see text]
Tuning Parameter Selection for Penalized Estimation via R²
The tuning parameter selection strategy for penalized estimation is crucial
to identify a model that is both interpretable and predictive. However, popular
strategies (e.g., minimizing average squared prediction error via
cross-validation) tend to select models with more predictors than necessary.
This paper proposes a simple, yet powerful cross-validation strategy based on
maximizing squared correlations between the observed and predicted values,
rather than minimizing squared error loss. The strategy can be applied to all
penalized least-squares estimators and we show that, under certain conditions,
the metric implicitly performs a bias adjustment. Specific attention is given
to the lasso estimator, in which our strategy is closely related to the relaxed
lasso estimator. We demonstrate our approach on a functional variable selection
problem to identify optimal placement of surface electromyogram sensors to
control a robotic hand prosthesis.
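
One way to see why a squared-correlation criterion can behave differently
from squared error loss is that squared correlation does not change when the
predictions are rescaled, whereas mean squared error penalizes the shrinkage
that penalized estimators introduce. The sketch below illustrates that
property and a simple selection rule; it is not the paper's procedure, and
the function names, the cv_preds layout, and the toy numbers are assumptions
made for illustration.

```python
def mean(values):
    return sum(values) / len(values)


def mse(y, yhat):
    """Mean squared prediction error."""
    return mean([(a - b) ** 2 for a, b in zip(y, yhat)])


def squared_correlation(y, yhat):
    """Squared Pearson correlation between observed and predicted values."""
    my, mh = mean(y), mean(yhat)
    sxy = sum((a - my) * (b - mh) for a, b in zip(y, yhat))
    sxx = sum((a - my) ** 2 for a in y)
    syy = sum((b - mh) ** 2 for b in yhat)
    if sxx == 0 or syy == 0:
        return 0.0  # constant predictions carry no correlation information
    return sxy ** 2 / (sxx * syy)


def select_lambda(cv_preds):
    """Pick the tuning parameter with the largest mean squared correlation.

    cv_preds maps each candidate value to a list of (y_fold, yhat_fold)
    pairs, one pair per held-out fold (a hypothetical layout).
    """
    def score(lam):
        return mean([squared_correlation(y, yhat) for y, yhat in cv_preds[lam]])
    return max(cv_preds, key=score)


# Toy data: the shrunken predictions are the unshrunken ones scaled by 0.5.
y = [1.0, 2.0, 3.0, 4.0, 5.0]
unshrunk = [1.1, 1.9, 3.2, 3.8, 5.0]
shrunk = [0.5 * v for v in unshrunk]

print(mse(y, unshrunk), mse(y, shrunk))
print(squared_correlation(y, unshrunk), squared_correlation(y, shrunk))
```

In this toy case the two sets of predictions have the same squared
correlation with y but very different squared error, so a squared-error
criterion would favor the less-shrunken fit while the correlation criterion
is indifferent to the scaling.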
Evaluation of the geometric accuracy of computed tomography and microcomputed tomography of the articular surface of the distal portion of the radius of cats.
Objective: To evaluate accuracy of articular surfaces determined by use of 2 perpendicular CT orientations, micro-CT, and laser scanning.
Sample: 23 cat cadavers.
Procedures: Images of antebrachia were obtained by use of CT (voxel size, 0.6 mm) in longitudinal orientation (CTLO images) and transverse orientation (CTTO images) and by use of micro-CT (voxel size, 0.024 mm) in a longitudinal orientation. Images were reconstructed. Craniocaudal and mediolateral length, radius of curvature, and deviation of the articular surface of the distal portion of the radius of 3-D renderings for CTLO, CTTO, and micro-CT images were compared with results of 3-D renderings acquired with a laser scanner (resolution, 0.025 mm).
Results: Measurement of CTLO and CTTO images overestimated craniocaudal and mediolateral length of the articular surface by 4% to 10%. Measurement of micro-CT images underestimated craniocaudal and mediolateral length by 1%. Measurement of CTLO and CTTO images underestimated mediolateral radius of curvature by 15% and overestimated craniocaudal radius of curvature by > 100%; use of micro-CT images underestimated them by 3% and 5%, respectively. Mean ± SD surface deviation was 0.26 ± 0.09 mm for CTLO images, 0.30 ± 0.28 mm for CTTO images, and 0.04 ± 0.02 mm for micro-CT images.
Conclusions and clinical relevance: Articular surface models derived from CT images had dimensional errors that approximately matched the voxel size. Thus, CT cannot be used to plan conforming arthroplasties in small joints and could lack precision when used to plan the correction of a limb deformity or repair of a fracture.